34 research outputs found

    Logical and Algebraic Characterizations of Rational Transductions

    Rational word languages can be defined by several equivalent means: finite-state automata, rational expressions, finite congruences, or monadic second-order (MSO) logic. The robust subclass of aperiodic languages is defined by counter-free automata, star-free expressions, aperiodic (finite) congruences, or first-order (FO) logic. In particular, the algebraic characterization by aperiodic congruences makes it possible to decide whether a regular language is aperiodic. We lift this decidability result to rational transductions, i.e., word-to-word functions defined by finite-state transducers. In this context, logical and algebraic characterizations have also been proposed. Our main result is that one can decide whether a rational transduction (given as a transducer) is in a given decidable congruence class. We also establish a transfer result from logic-algebra equivalences over languages to equivalences over transductions. As a consequence, it is decidable whether a rational transduction is first-order definable, and we show that this problem is PSPACE-complete.
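
    The language-level decision procedure mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the input is a minimal complete DFA (so that its transition monoid is the syntactic monoid of the language), and the names `transition_monoid`, `is_aperiodic`, and the `delta` encoding are hypothetical. A monoid is aperiodic when every element m satisfies m^n = m^(n+1) for some n.

```python
def compose(f, g):
    """Apply f first, then g (both are tuples mapping state -> state)."""
    return tuple(g[q] for q in f)


def transition_monoid(delta, n_states):
    """Close the letter actions under composition.

    delta maps each letter to a tuple of length n_states giving the
    successor of every state (hypothetical encoding for this sketch).
    """
    identity = tuple(range(n_states))
    seen = {identity}
    frontier = [identity]
    while frontier:
        m = frontier.pop()
        for letter_map in delta.values():
            nxt = compose(m, letter_map)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def is_aperiodic_element(m):
    """True if the powers of m reach a fixed point m^n = m^(n+1)."""
    seen = set()
    power = m
    while power not in seen:
        seen.add(power)
        nxt = compose(power, m)
        if nxt == power:
            return True
        power = nxt
    # The powers looped back without hitting a fixed point: a nontrivial cycle.
    return False


def is_aperiodic(delta, n_states):
    """Assumes delta describes a minimal complete DFA."""
    return all(is_aperiodic_element(m) for m in transition_monoid(delta, n_states))


# (ab)*: state 0 is initial and accepting, state 2 is a sink.
print(is_aperiodic({"a": (1, 2, 2), "b": (2, 0, 2)}, 3))  # True
# (aa)*: the letter a swaps the two states, which yields a group of order 2.
print(is_aperiodic({"a": (1, 0)}, 2))  # False
```

    The transition monoid can be exponentially larger than the automaton, so this brute-force enumeration is only meant to make the definition concrete on small examples.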

    Streamability of nested word transductions

    We consider the problem of evaluating in streaming (i.e., in a single left-to-right pass) a nested word transduction with a limited amount of memory. A transduction T is said to be height-bounded memory (HBM) if it can be evaluated with a memory that depends only on the size of T and on the height of the input word. We show that it is decidable in coNPTime whether a nested word transduction defined by a visibly pushdown transducer (VPT) is HBM. In this case, the required amount of memory may depend exponentially on the height of the word. We exhibit a sufficient, decidable condition for a VPT to be evaluated with a memory that depends quadratically on the height of the word. This condition defines a class of transductions that strictly contains all determinizable VPTs.
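
    To make the notion of memory bounded by the height of the input concrete, here is a small sketch of a single-pass check over a nested word. It is not the paper's algorithm and does not evaluate a VPT: the event encoding (`"call"`, `"return"`, `"internal"`) and the function name `stream_check` are assumptions made for illustration. The only state kept is a stack whose size never exceeds the nesting depth reached so far.

```python
def stream_check(events):
    """Check well-nestedness in one left-to-right pass.

    events is any iterable of (kind, label) pairs, where kind is
    "call", "return", or "internal" (hypothetical encoding).
    Returns (well_nested, peak_stack_size).
    """
    stack = []
    peak = 0
    for kind, label in events:
        if kind == "call":
            stack.append(label)
            peak = max(peak, len(stack))
        elif kind == "return":
            # A return must match the most recent unmatched call.
            if not stack or stack.pop() != label:
                return False, peak
        # Internal symbols need no memory in this sketch.
    return not stack, peak


# Encodes <a><b></b><b></b></a>: six events, height 2.
doc = [
    ("call", "a"),
    ("call", "b"), ("return", "b"),
    ("call", "b"), ("return", "b"),
    ("return", "a"),
]
print(stream_check(iter(doc)))  # (True, 2)
```

    The input may be arbitrarily long, yet the stack stays as small as the height of the word, which is the kind of bound the HBM property asks for.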

    Streaming Tree Automata

    Streaming validation and querying of XML documents are often based on automata for tree-like structures. We propose a new notion of streaming tree automata in order to unify the two main approaches, which have not been linked so far: automata for nested words (or, equivalently, visibly pushdown automata) and pushdown forest automata.

    Aperiodicity of Rational Functions Is PSPACE-Complete

    On Canonical Models for Rational Functions over Infinite Words

    This paper investigates canonical transducers for rational functions over infinite words, i.e., functions of infinite words defined by finite transducers. We first consider sequential functions, defined by finite transducers with a deterministic underlying automaton. We provide a Myhill-Nerode-like characterization, in the vein of Choffrut's result over finite words, from which we derive an algorithm that computes a transducer realizing the function which is minimal and unique (up to the automaton for the domain). The main contribution of the paper is the notion of a canonical transducer for rational functions over infinite words, extending the notion of canonical bimachine due to Reutenauer and Schützenberger from finite to infinite words. As an application, we show that the canonical transducer is aperiodic whenever the function is definable by some aperiodic transducer, or equivalently, by a first-order transduction. This makes it possible to decide whether a rational function of infinite words is first-order definable.

    Early Nested Word Automata for XPath Query Answering on XML Streams

    ...polynomial time for disjunctions of k-bounded simpl...

    Flux XML, Requêtes XPath et Automates

    In recent years, XML has evolved into the quasi-standard format for data exchange. Most typically, XML documents are produced from databases, during document processing, and for Web applications. Streaming is a natural exchange mode that is frequently used when sending large amounts of data over networks, such as in database-driven Web applications. Streaming is thus relevant for many XML processing tasks. In this thesis, we study streaming algorithms for XML query answering. Our main objective lies in efficient memory management, in order to be able to query huge data collections with low memory consumption. This turns out to be a surprisingly complex task, which requires serious restrictions on the query language. We therefore consider queries defined by deterministic automata or in fragments of the W3C standard language XPath, rather than studying more powerful languages such as the W3C standards XQuery or XSLT. We first propose Streaming Tree Automata (STAs) that operate on unranked trees in streaming order, and prove them equivalent to Nested Word Automata and to Pushdown Forest Automata. We then contribute an earliest query answering algorithm for queries defined by deterministic STAs. Even though it manages to store only alive answer candidates, it takes only PTIME per event and candidate. This yields positive streamability results for classes of queries defined by deterministic STAs. The precise streamability notion here relies on a new machine model that we call Streaming Random Access Machines (SRAMs), and on the number of concurrently alive candidates of a query. We also show that bounded concurrency is decidable in PTIME for queries defined by deterministic STAs. Our proof is by reduction to bounded valuedness of recognizable tree relations. Concerning the W3C standard query language XPath, we first show that small syntactic fragments are not streamable unless P=NP. The problematic features are non-determinism in combination with nesting of and/or operators. We define fragments of Forward XPath with schema assumptions that avoid these aspects and prove them streamable by PTIME compilation to deterministic STAs.

    Streaming tree automata and XPath

    The growing interest in Web technologies leads to new challenges. XML is now a reference for storing and exchanging data. Some XML documents are now so large that it is inefficient or even impossible to store them in main memory. This calls for new paradigms to process these data. One of them consists in considering an XML document as a stream, corresponding to a one-way reading of this document. This stream is then processed on the fly. Hence the document is never stored in main memory, and only the useful parts are memorized. One task of XML processing is to retrieve information using queries. This is the base step for XML document transformation, which allows applications using distinct XML schemas to exchange data. This thesis studies the query answering problem on XML streams. Two query classes are considered: the XPath standard, and tree automata. For this purpose, a measure of streamability of a query is introduced. This measure shows that queries defined by XPath expressions or tree automata are not streamable. For both query formalisms, large streamable fragments are introduced and studied. For queries defined by tree automata, two other streamability criteria are proved to be decidable in polynomial time.

    Transductions: resources and characterizations

    Transducers define word-to-word transformations by extending automata with outputs. We study some decision problems related to transducers. First, we characterize some resources required by any functional transducer implementing a given transformation. We begin with two algorithms determining whether a two-way functional transducer has an equivalent one-way transducer, and synthesizing it in this case. If the transducer is not one-way definable, another algorithm decides whether it can perform its reversals only at the borders of the input word (sweeping transducers), and determines the minimal number of passes over the input. A side result is the minimization of the number of registers of a particular class of streaming string transducers, a model of one-way transducers with registers. We also study the memory required when evaluating visibly pushdown transducers, in particular whether the stack is required, and if so, whether the memory can be bounded by the degree of nesting of the input word. Second, we study the algebraic properties of functional transductions. A central result is an algorithm that takes a one-way transducer (or a bimachine) as input, and decides whether it belongs to a given decidable congruence class (for instance, aperiodic congruences). A transfer theorem between algebra and logic relates congruence classes to logics. For instance, aperiodic congruences characterize exactly the transductions definable in first-order logic. We extend this result to infinite words for the special case of aperiodic transductions. As a consequence, it is decidable whether a rational transduction is first-order definable, for both finite and infinite words.